Identification of protein coding genes in genomes with statistical functions based on the circular code.
Abstract
A new statistical approach using functions based on the circular code correctly classifies more than 93% of bases in protein-coding genes and non-coding genes of human sequences. Based on this statistical study, a research software package called 'Analysis of Coding Genes' (ACG) has been developed for identifying protein-coding genes in genomes and for determining their frame. ACG also allows an evaluation of the length of protein-coding genes, their position in the genome, their position relative to one another, and the prediction of internal frames in protein-coding genes.
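To illustrate the idea of determining a frame with a circular code, here is a minimal sketch. It is not the statistical functions used by ACG, which the abstract does not specify. It assumes the 20-trinucleotide set usually cited as the common circular code X (Arquès and Michel, 1996), which is not listed in the abstract, and it scores each frame by a simple fraction of trinucleotides belonging to that set. The names `frame_scores` and `guess_frame` are hypothetical.

```python
# Assumption: the 20 trinucleotides usually cited as the common circular
# code X (Arquès and Michel, 1996). The abstract itself does not list them.
X = frozenset(
    """AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC
       GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC""".split()
)


def frame_scores(seq, code=X):
    """For each of the three frames, return the fraction of
    trinucleotides read in that frame that belong to `code`."""
    seq = seq.upper()
    scores = []
    for shift in range(3):
        # Read complete trinucleotides only, starting at position `shift`.
        codons = [seq[i:i + 3] for i in range(shift, len(seq) - 2, 3)]
        hits = sum(1 for c in codons if c in code)
        scores.append(hits / len(codons) if codons else 0.0)
    return scores


def guess_frame(seq, code=X):
    """Return the shift (0, 1 or 2) whose frame has the highest score.
    A naive heuristic for illustration, not the ACG decision rule."""
    scores = frame_scores(seq, code)
    return max(range(3), key=scores.__getitem__)


if __name__ == "__main__":
    # A toy sequence built from words of X, preceded by one extra base.
    print(frame_scores("CGAAGATGTC"))
    print(guess_frame("CGAAGATGTC"))
```

On real genomic sequences the three scores would be much closer to one another than in this toy case, which is presumably why a windowed statistical treatment is needed in practice.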
Related articles
Identification of circular codes in bacterial genomes and their use in a factorization method for retrieving the reading frames of genes
We developed a statistical method that allows each trinucleotide to be associated with a unique frame among the three possible ones in a (protein coding) gene. An extensive gene study in 175 complete bacterial genomes based on this statistical approach resulted in identification of 72 new circular codes. Finding a circular code enables an immediate retrieval of the reading frame locally anywher...
Computational Identification of Micro RNAs and Their Transcript Target(s) in Field Mustard (Brassica rapa L.)
Background: Micro RNAs (miRNAs) are a pivotal part of non-protein-coding endogenous small RNA molecules that regulate the genes involved in plant growth and development, and respond to biotic and abiotic environmental stresses post-transcriptionally. Objective: In the present study, we report the results of a systemic search for identification of new miRNAs in B. rapa using homology-based ...
OPTIMAL SENSOR PLACEMENT FOR MODAL IDENTIFICATION OF A STRAP-BRACED COLD FORMED STEEL FRAME BASED ON IMPROVED GENETIC ALGORITHM
This paper is concerned with the determination of optimal sensor locations for structural modal identification in a strap-braced cold formed steel frame based on an improved genetic algorithm (IGA). Six different optimal sensor placement performance indices have been taken as the fitness functions: two based on the modal assurance criterion (MAC), two based on maximization of the determinant of a Fi...
Evaluation of First and Second Markov Chains Sensitivity and Specificity as Statistical Approach for Prediction of Sequences of Genes in Virus Double Strand DNA Genomes
The growing amount of information on biological sequences has made the application of statistical approaches necessary for modeling and estimating their functions. In this paper, the sensitivity and specificity of the first and second Markov chains for prediction of genes were evaluated using the complete double stranded DNA virus. There were two approaches for prediction of each Markov Model parameter,...
Circular RNA: features, functions and their correlation with diseases especially cancer
In early 2012, the world of science saw a fascinating discovery called circular RNA as a transcription product of thousands of genes in mice and humans. These circular RNAs have recently been grouped as the encoding RNA in an independent group whose remarkable difference from other RNAs is that they are not linear: their two ends are joined by a covalent connection, creating a loop-...
Journal: Bio Systems
Volume 66, Issue 1-2
Publication date: 2002